Text line segmentation of historical documents: a survey
Similar resources
A TaLISMAN: Automatic Text and LIne Segmentation of historical MANuscripts
Historical and artistic handwritten books are valuable cultural heritage (CH) items, as they provide information about tangible and intangible cultural aspects from the past. Massive digitization projects have made this kind of data available to a world-wide population, and pose real challenges for automatic processing. In this scenario, document layout analysis plays a significant role, being...
A New Method for Text-Line Segmentation for Warped Documents
Bound documents either scanned or captured with digital cameras often present a geometrical warp that makes text-lines curled. The identification of text-lines is one of the steps for document de-warping when only a single image is available. This paper presents a new method for text-line segmentation. It is based on a simple, but effective, skew detector proposed by Ávila and Lins and simplifies th...
Text line and word segmentation of handwritten documents
In this paper, we present a segmentation methodology of handwritten documents in their distinct entities, namely, text lines and words. Text line segmentation is achieved by applying Hough transform on a subset of the document image connected components. A post-processing step includes the correction of possible false alarms, the detection of text lines that Hough transform failed to create and...
Text line segmentation in handwritten documents using Mumford-Shah model
Text line segmentation in handwritten documents is an important step in document processing. We present a new text line segmentation method based on the Mumford-Shah model. The algorithm is script independent. In addition, we use morphing to remove overlaps between neighboring text lines and connect broken ones. Experimental results show the validity of our method.
Handwritten Text Recognition for Historical Documents
The number of digitized legacy documents has been rising dramatically over the last years, due mainly to the increasing number of on-line digital libraries publishing this kind of document. The vast majority of them remain waiting to be transcribed into a textual electronic format (such as ASCII or PDF) that would provide historians and other researchers new ways of indexing, consulting and que...
Journal
Journal title: International Journal of Document Analysis and Recognition (IJDAR)
Year: 2006
ISSN: 1433-2833,1433-2825
DOI: 10.1007/s10032-006-0023-z